feat: add top-level kNN search via Query\Knn and Query::setKnn() #2314

Open
Cryde wants to merge 1 commit into ruflin:9.x from Cryde:feature/knn-top-level-search

Conversation

Cryde commented May 18, 2026

Adds an Elastica\Query\Knn class wrapping Elasticsearch's top-level knn
search clause, plus Query::setKnn() to attach a single Knn or a list of
Knn (multiple kNN searches in a single request) as a sibling of query.

  • Knn supports field, query_vector, k, num_candidates plus optional
    pre-filters (addFilter), similarity threshold and boost.
  • Query::toArray() no longer auto-pads with match_all when only knn is set.
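
For illustration, here is a minimal sketch of the request-body shape described above, written in plain PHP without Elastica. The helpers `knnClause()` and `searchBody()` are hypothetical and are not part of this PR, and the field name `embedding` and the `tag` filter are made-up sample values. Only the option names (`field`, `query_vector`, `k`, `num_candidates`, `filter`, `similarity`, `boost`) and the placement of `knn` next to `query` come from the description.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: builds one top-level knn clause as a plain array.
 *
 * @param list<float>          $queryVector
 * @param array<string, mixed> $options     optional keys such as filter, similarity, boost
 *
 * @return array<string, mixed>
 */
function knnClause(string $field, array $queryVector, int $k, int $numCandidates, array $options = []): array
{
    // Array union keeps the four required keys and appends the optional ones.
    return [
        'field' => $field,
        'query_vector' => $queryVector,
        'k' => $k,
        'num_candidates' => $numCandidates,
    ] + $options;
}

/**
 * Hypothetical helper: places knn next to query instead of inside it.
 * One clause is emitted as an object, several clauses as a list.
 * No match_all is added when only knn is present.
 *
 * @param list<array<string, mixed>> $knnClauses
 * @param array<string, mixed>|null  $query
 *
 * @return array<string, mixed>
 */
function searchBody(array $knnClauses, ?array $query = null): array
{
    $body = [];
    if (null !== $query) {
        $body['query'] = $query;
    }
    if ([] !== $knnClauses) {
        $body['knn'] = 1 === \count($knnClauses) ? $knnClauses[0] : $knnClauses;
    }

    return $body;
}

// Sample values only; 'embedding' and 'tag' are assumptions for the demo.
$clause = knnClause('embedding', [0.1, 0.2, 0.3], 5, 50, [
    'filter' => ['term' => ['tag' => 'php']],
    'boost' => 0.5,
]);

echo \json_encode(searchBody([$clause]), \JSON_PRETTY_PRINT), \PHP_EOL;
```

In this sketch the body contains only a `knn` key when no query is supplied, which mirrors the second bullet above.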

Summary by CodeRabbit

  • New Features

    • Top-level kNN (k-nearest neighbors) query support for vector similarity searches.
    • Support for single or multiple top-level kNN queries.
    • Options to apply pre-filters, set similarity threshold, and apply boost weights to kNN queries.
  • Tests

    • Added unit and functional tests covering kNN serialization, filter integration, multi-kNN handling, and end-to-end vector search.


coderabbitai Bot commented May 18, 2026

📝 Walkthrough

Adds top-level kNN query support: new Elastica\Query\Knn builder, Query::setKnn() integration with updated raw-query type and default-query logic, unit and functional tests, and a changelog entry.

Changes

kNN Query Feature

| Layer / File(s) | Summary |
| --- | --- |
| Knn Query Builder Class (src/Query/Knn.php) | New Knn class extends Param, sets required params (field, query_vector, k, num_candidates) in the constructor and provides addFilter(), setSimilarity(), and setBoost() fluent setters. |
| Query Class Integration (src/Query.php) | Imports Knn; expands @phpstan-type TRawQuery with optional knn; updates toArray() to avoid injecting match_all when knn is present; adds `setKnn(Knn |
| Test Suite (tests/Query/KnnTest.php) | Unit tests validate Knn->toArray() (basic and with filters/similarity/boost), Query::setKnn() behavior for single and multiple knn entries, and a functional dense_vector kNN search with a tag filter. |
| Documentation (CHANGELOG.md) | Adds [Unreleased] changelog entry documenting top-level kNN support via Elastica\Query\Knn and Query::setKnn() with optional filters, similarity, and boost. |

Sequence Diagram

sequenceDiagram
  participant Client
  participant Query
  participant Knn
  participant Elasticsearch
  Client->>Query: setKnn(Knn with filters)
  Query->>Knn: toArray()
  Knn-->>Query: serialized knn parameter(s)
  Query->>Query: store at top-level "knn"
  Query-->>Client: Query object (fluent)
  Client->>Elasticsearch: execute request containing knn
  Elasticsearch-->>Client: matching documents

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I stitched a Knn with vectors snug and light,
Filters and boosts to tune the search just right,
setKnn() tucks them tidy at the request's top,
Tests hop forward, proving matches never stop,
May your queries find carrots, shiny and bright.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 58.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: introducing top-level kNN search functionality via a new Query\Knn class and Query::setKnn() method. The title is concise, specific, and directly reflects the primary purpose of the changeset. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/Query.php`:
- Line 56: TRawQuery's knn union type is too permissive and allows object forms
that can become nested (knn.knn) when serialized; change the TRawQuery['knn']
definition to accept only normalized array shapes (e.g.,
array<string,mixed>|list<array<string,mixed>>) and remove Knn|list<Knn> from
that raw type, or alternatively add a normalization step in the
raw-to-normalized conversion path (the function/method that builds/serializes
TRawQuery) to detect Knn objects and convert them into the flat array form
before serialization; update any references to TRawQuery and the knn handling
code to use the normalized array form so raw queries never contain nested knn
objects.
- Line 511: Update the setKnn method signature to use a native integer type for
its parameter (change setKnn($knn) to setKnn(int $knn): self) and adjust any
callers to pass an int; also update or add PHPDoc on Query::setKnn to reflect
the typed parameter if needed. Ensure the method body and any
assignments/validations inside the setKnn implementation accept and operate on
an int value.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8551ea06-cc17-4fcf-90b0-2fdac223ebac

📥 Commits

Reviewing files that changed from the base of the PR and between bbd7828 and e32cf3d.

📒 Files selected for processing (4)
  • CHANGELOG.md
  • src/Query.php
  • src/Query/Knn.php
  • tests/Query/KnnTest.php

Comment thread src/Query.php Outdated
Comment thread src/Query.php Outdated
Cryde force-pushed the feature/knn-top-level-search branch 2 times, most recently from 3824342 to c10a45a on May 18, 2026 14:11

ruflin (Owner) left a comment

Thanks for the contribution. What is your take on Knn being its own object, not extending Query (as it isn't one 🤔)?

Comment thread src/Query/Knn.php
/**
* Top-level kNN search.
*
* Note: `knn` is a sibling of `query` in the search request body, not a clause inside it.

Owner

That is the part I stumble over. Should we make it part of Query, or make it its own top-level thing extending Param instead?

Author

You're right, knn is a sibling of query in the search body.

Nothing today prevents a user from passing a Knn to Query::setQuery() or nesting it inside a BoolQuery, where it would silently produce an invalid request.

I'll switch to extending Param. Pushing the change shortly!
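
To make the point above concrete, here is a small self-contained sketch of the type separation being discussed. Every class in it is a hypothetical stand-in, not an Elastica class; it assumes a query setter with a native query type hint, and only shows that an object extending the param base alone is rejected there.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-ins; these are NOT the Elastica classes.
class ParamStandIn
{
}

abstract class QueryStandIn extends ParamStandIn
{
}

final class MatchAllStandIn extends QueryStandIn
{
}

// Extends the param base only, so it is not a query.
final class KnnStandIn extends ParamStandIn
{
}

final class SearchStandIn
{
    public ?QueryStandIn $query = null;
    public ?KnnStandIn $knn = null;

    public function setQuery(QueryStandIn $query): self
    {
        $this->query = $query;

        return $this;
    }

    public function setKnn(KnnStandIn $knn): self
    {
        $this->knn = $knn;

        return $this;
    }
}

$search = (new SearchStandIn())
    ->setQuery(new MatchAllStandIn())
    ->setKnn(new KnnStandIn());

try {
    // A kNN object passed where a query is expected fails at the call site.
    $search->setQuery(new KnnStandIn());
    $rejected = false;
} catch (\TypeError $e) {
    $rejected = true;
}

echo $rejected ? 'rejected' : 'accepted', \PHP_EOL;
```

With this layout the misuse surfaces as a `TypeError` in PHP rather than as an invalid request sent to Elasticsearch.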

Comment thread src/Query/Knn.php Outdated
*
* @see https://www.elastic.co/docs/solutions/search/vector/knn
*/
class Knn extends AbstractQuery

Owner

see comment above

Cryde force-pushed the feature/knn-top-level-search branch from c10a45a to 72d0ea0 on May 21, 2026 12:17
Cryde (Author) commented May 21, 2026

Also, maybe I should move it to Elastica\Knn rather than Elastica\Query\Knn?

coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/Query.php`:
- Around line 511-517: The setKnn method currently accepts any array and then
maps assuming list<Knn>; update setKnn(Knn|array $knn) to validate array inputs
before mapping: if $knn is an array ensure it is a non-empty list (no
associative keys), and each element is an instance of Knn; if validation fails
throw InvalidException with a clear message like "setKnn expects a non-empty
list of Knn instances"; after validation proceed to map each Knn to its
->toArray()['knn'] value (preserving existing mapping behavior) so only valid
Knn payloads are produced.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1596fbd3-02fa-4d08-87ef-59e30cafa20f

📥 Commits

Reviewing files that changed from the base of the PR and between c10a45a and 72d0ea0.

📒 Files selected for processing (4)
  • CHANGELOG.md
  • src/Query.php
  • src/Query/Knn.php
  • tests/Query/KnnTest.php
✅ Files skipped from review due to trivial changes (1)
  • CHANGELOG.md

Comment thread src/Query.php
Comment on lines +511 to +517
public function setKnn(Knn|array $knn): self
{
    if (\is_array($knn)) {
        $value = \array_map(static fn (Knn $k): array => $k->toArray()['knn'], $knn);
    } else {
        $value = $knn->toArray()['knn'];
    }


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate setKnn() array input before mapping.

Line 513 accepts any array, but the mapper assumes list<Knn>. Empty/non-list/non-Knn entries fail late (or build invalid knn payloads). Add explicit guards and throw InvalidException with a clear message.

Proposed fix
 public function setKnn(Knn|array $knn): self
 {
     if (\is_array($knn)) {
+        if ([] === $knn || !\array_is_list($knn)) {
+            throw new InvalidException('Knn must be a non-empty list of Knn instances.');
+        }
+        foreach ($knn as $entry) {
+            if (!$entry instanceof Knn) {
+                throw new InvalidException('Each knn entry must be an instance of '.Knn::class.'.');
+            }
+        }
         $value = \array_map(static fn (Knn $k): array => $k->toArray()['knn'], $knn);
     } else {
         $value = $knn->toArray()['knn'];
     }

     return $this->setParam('knn', $value);
 }
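
As a rough, runnable illustration of the guard proposed above, the sketch below applies the same checks in a standalone function. `KnnStub` and `normalizeKnn()` are hypothetical names, and `\InvalidArgumentException` stands in for Elastica's `InvalidException` so that the example has no dependencies. It assumes PHP 8.1 or later because of `array_is_list()`.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the PR's Knn class; only the toArray() shape
// (['knn' => [...]]) is taken from the diff above.
final class KnnStub
{
    public function __construct(private string $field)
    {
    }

    public function toArray(): array
    {
        return ['knn' => ['field' => $this->field]];
    }
}

/**
 * Hypothetical free function mirroring the guarded setKnn() logic.
 *
 * @param KnnStub|array<mixed> $knn a single stub or a candidate list of stubs
 *
 * @return array<mixed> one flat clause, or a list of flat clauses
 */
function normalizeKnn(KnnStub|array $knn): array
{
    if ($knn instanceof KnnStub) {
        return $knn->toArray()['knn'];
    }
    // Reject empty arrays and arrays with non-sequential or string keys.
    if ([] === $knn || !\array_is_list($knn)) {
        throw new \InvalidArgumentException('Knn must be a non-empty list of Knn instances.');
    }
    foreach ($knn as $entry) {
        if (!$entry instanceof KnnStub) {
            throw new \InvalidArgumentException('Each knn entry must be an instance of '.KnnStub::class.'.');
        }
    }

    return \array_map(static fn (KnnStub $k): array => $k->toArray()['knn'], $knn);
}

// Three invalid inputs: empty, associative, and wrong element type.
foreach ([[], ['a' => new KnnStub('v')], ['not a knn']] as $bad) {
    try {
        normalizeKnn($bad);
    } catch (\InvalidArgumentException $e) {
        echo $e->getMessage(), \PHP_EOL;
    }
}
```

Each of the three invalid inputs is rejected before the mapping step runs, so no partially built `knn` payload is produced.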

ruflin (Owner) commented Jun 19, 2026

Argh, I missed the ping.

> Also maybe I should move to Elastica\Knn rather than Elastica\Query\Knn ?

I was thinking the same. I would like to limit the number of high-level objects, but it likely makes more sense.


2 participants